Twice Regression Learning and Its Application on Software Effort Estimation<sup>*</sup>

doi:10.16451/j.cnki.issn1003-6059.201501008


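Abstract: Regression learning is a supervised learning technique that builds models from examples with real-valued labels. Good predictive performance usually requires a large number of training samples, yet very few can be collected in real applications. To address this problem, a new twice regression learning method based on the twice learning framework is proposed: the neural network ensemble based regression tree algorithm (NERT). With the help of virtual sample generation, the method exploits the generated samples through two learning stages executed in sequence, which alleviates the shortage of training samples and thereby improves learning performance. Moreover, by choosing a learning method with strong generalization ability and a learning method with good comprehensibility for the two stages respectively, a model with both good predictive performance and high comprehensibility can be obtained. Experimental results show that, on software development effort estimation problems with very few training samples, NERT achieves better predictive performance from small-sample data than existing methods, and the inherent comprehensibility of its model reveals the key factors in effort estimation.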


Keywords: regression analysis, machine learning, twice regression learning, software mining, effort estimation

Abstract: Regression learning is a supervised learning technique that builds models from examples with real-valued labels. It usually needs a large number of training samples to achieve good predictive performance. However, very few training samples can be collected in real applications. To address this problem, the neural network ensemble to regression tree (NERT) algorithm is proposed based on the twice learning framework. By means of virtual sample generation, the method uses two sequential learning stages to relieve the shortage of training samples and thereby improve performance. By choosing a method with high generalization ability and a method with good comprehensibility for the two stages respectively, a model that is both accurate and comprehensible can be obtained. Results on software effort estimation with few training samples show that NERT achieves better performance from such small data sets than existing methods, and that its inherent comprehensibility reveals the key factors in effort estimation.
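The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give NERT's details, so everything below is an assumption. To stay dependency-free, a bagged k-nearest-neighbour ensemble stands in for the neural network ensemble in stage one, virtual samples are drawn uniformly inside the observed feature ranges, the original samples keep their true labels, and all function names (`fit_ensemble`, `make_virtual_samples`, `build_tree`, `nert_sketch`) and the toy data are hypothetical.

```python
import random
import statistics

def knn_predict(train, x, k=3):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))[:k]
    return statistics.fmean([y for _, y in nearest])

def fit_ensemble(samples, n_members=10, rng=None):
    """Stage 1 stand-in (assumption): bagging, one bootstrap resample per member."""
    rng = rng or random.Random(0)
    return [[rng.choice(samples) for _ in samples] for _ in range(n_members)]

def ensemble_predict(ensemble, x):
    """Ensemble output is the mean of the members' predictions."""
    return statistics.fmean([knn_predict(member, x) for member in ensemble])

def make_virtual_samples(samples, ensemble, n_virtual, rng):
    """Draw random feature vectors in the observed ranges; label them with the ensemble."""
    dims = range(len(samples[0][0]))
    lo = [min(x[d] for x, _ in samples) for d in dims]
    hi = [max(x[d] for x, _ in samples) for d in dims]
    virtual = []
    for _ in range(n_virtual):
        x = tuple(rng.uniform(lo[d], hi[d]) for d in dims)
        virtual.append((x, ensemble_predict(ensemble, x)))
    return virtual

def sse(ys):
    """Sum of squared deviations from the mean."""
    m = statistics.fmean(ys)
    return sum((y - m) ** 2 for y in ys)

def build_tree(samples, depth=0, max_depth=3, min_leaf=5):
    """Stage 2: a small regression tree; a leaf is a float, a node is a tuple."""
    ys = [y for _, y in samples]
    if depth >= max_depth or len(samples) < 2 * min_leaf:
        return statistics.fmean(ys)
    best = None
    for d in range(len(samples[0][0])):
        ordered = sorted(samples, key=lambda s: s[0][d])
        for i in range(min_leaf, len(ordered) - min_leaf + 1):
            if ordered[i - 1][0][d] == ordered[i][0][d]:
                continue  # cannot split between identical feature values
            left, right = ordered[:i], ordered[i:]
            cost = sse([y for _, y in left]) + sse([y for _, y in right])
            if best is None or cost < best[0]:
                threshold = (ordered[i - 1][0][d] + ordered[i][0][d]) / 2
                best = (cost, d, threshold, left, right)
    if best is None:
        return statistics.fmean(ys)
    _, d, threshold, left, right = best
    return (d, threshold,
            build_tree(left, depth + 1, max_depth, min_leaf),
            build_tree(right, depth + 1, max_depth, min_leaf))

def tree_predict(tree, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        d, threshold, left, right = tree
        tree = left if x[d] <= threshold else right
    return tree

def nert_sketch(samples, n_virtual=200, seed=0):
    """Ensemble first, then a tree trained on original plus virtual samples."""
    rng = random.Random(seed)
    ensemble = fit_ensemble(samples, rng=rng)
    virtual = make_virtual_samples(samples, ensemble, n_virtual, rng)
    return build_tree(samples + virtual)

# Made-up toy data (not from the paper): 12 samples, two features.
toy_rng = random.Random(1)
toy = []
for _ in range(12):
    size, complexity = toy_rng.random(), toy_rng.random()
    toy.append(((size, complexity), 2.0 * size + (5.0 if complexity > 0.5 else 0.0)))

tree = nert_sketch(toy)
print(tree_predict(tree, (0.5, 0.9)))
```

The design point the sketch tries to convey is that the tree never has to learn from the 12 original samples alone: the stage-one ensemble supplies labels for as many virtual samples as needed, and the final model remains a tree whose split features can be inspected directly.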

Key words: Regression Analysis, Machine Learning, Twice Regression Learning, Software Mining, Software Effort Estimation

Received: 2013-05-13

CLC number (ZTFLH): TP 181

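Funding: Supported by the National Natural Science Foundation of China (No. 61272217), the Program for New Century Excellent Talents in University of the Ministry of Education (No. NCET-13-0275), and the Natural Science Foundation of Jiangsu Province (No. BK20131278)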

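About the authors: YANG Zi-Xu, male, born in 1989, master's student. His research interests include machine learning and data mining. E-mail: yangzx@lamda.nju.edu.cn. LI Ming (corresponding author), male, born in 1980, Ph.D., associate professor. His research interests include machine learning and software mining. E-mail: lim@lamda.nju.edu.cn.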

Cite this article:

YANG Zi-Xu, LI Ming. Twice Regression Learning and Its Application on Software Effort Estimation[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2015, 28(1): 59-64.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.201501008 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2015/V28/I1/59